Analyzing Trends in House of Representatives from 2016-2020

INFO 526 - Project 1

Author

The Power of The Voters

Abstract

This study explores US House election results from 1976 to 2022, focusing on nationwide US House of Representatives trends from 2016 to 2020 and on state-wide trends in Arizona from 2012 to 2022. For 2016 to 2020, the data was wrangled into three subsets, one for each election cycle. To simplify the analysis, all parties other than Republican and Democrat were grouped into a single generalized party called “Other”. These subsets were then displayed on a US map, where the fill of each state indicates which party holds the House majority, while the label of the state indicates which party won the popular vote. This revealed how the electoral system can produce representatives who may not reflect the sentiment of a state’s population. Zooming in on Arizona, voting trends were analyzed over a longer time frame of 10 years, from 2012 to 2022. The mode of voting and the change of party in each district were visualized to show how the mode of voting relates to the winning candidate, and how that in turn influences the winning party in each district.

Insights into voting trends are important to politicians and the general population because they can be directly related to other aspects of life, such as the economy and global politics. These insights can help politicians target states where voter sentiment can make or break a politician’s campaign.

The analysis focused on time-series analysis, as the variables were viewed over time and the changes were noted as valuable insights. The limitations of this project include the assumption that the conglomeration of minor parties forms a larger third party that votes differently from Democrats and Republicans. This is not the case in real life, where some minor parties align closely with the larger parties and some vote on their own ideas. Ultimately, this study provides insights into US voting patterns and the impact of election results on future voting trends.

Introduction

Introducing the Dataset

The dataset, US House Election Results, is sourced from the MIT Election Data and Science Lab (MEDSL) and offers a comprehensive overview of US House elections.

This dataset contains observations for elections held over 47 years, from 1976 to 2022, with a total of 32,452 records. Each record is a row with 20 attributes as columns in the source data; the loaded table shown below has 21 columns because it also carries a State_Population column. The columns provide details including the year, state, district, political party, candidate’s name, and votes received, along with indicators such as whether the election was a runoff or whether the candidate was a write-in.

EDA

# A tibble: 32,452 × 21
    year state   state_po state_fips state_cen state_ic office   district stage
   <dbl> <chr>   <chr>         <dbl>     <dbl>    <dbl> <chr>       <dbl> <chr>
 1  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 2  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 3  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 4  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 5  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 6  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 7  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
 8  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
 9  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
10  1976 ALABAMA AL                1        63       41 US HOUSE        4 GEN  
# ℹ 32,442 more rows
# ℹ 12 more variables: runoff <dbl>, special <dbl>, candidate <chr>,
#   party <chr>, writein <dbl>, mode <chr>, candidatevotes <dbl>,
#   totalvotes <dbl>, unofficial <dbl>, version <dbl>, fusion_ticket <dbl>,
#   State_Population <chr>
Summary Statistics for year :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1976    1988    2000    2000    2012    2022 

Summary Statistics for state :
   Length     Class      Mode 
    32452 character character 

Summary Statistics for state_po :
   Length     Class      Mode 
    32452 character character 

Summary Statistics for state_fips :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00   17.00   31.00   28.76   40.00   56.00 

Summary Statistics for state_cen :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  11.00   23.00   52.00   50.95   74.00   95.00 

Summary Statistics for state_ic :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00   14.00   40.00   37.09   52.00   82.00 

Summary Statistics for office :
   Length     Class      Mode 
    32452 character character 

Summary Statistics for district :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   3.000   6.000   9.848  13.000  53.000 

Summary Statistics for stage :
   Length     Class      Mode 
    32452 character character 

Summary Statistics for runoff :
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
0.0000000 0.0000000 0.0000000 0.0002465 0.0000000 1.0000000 

Summary Statistics for special :
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.000000 0.000000 0.000000 0.002773 0.000000 1.000000 

Summary Statistics for candidate :
   Length     Class      Mode 
    32452 character character 

Summary Statistics for party :
   Length     Class      Mode 
    32452 character character 

Summary Statistics for writein :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.00000 0.00000 0.00000 0.08412 0.00000 1.00000 

Summary Statistics for mode :
   Length     Class      Mode 
    32452 character character 

Summary Statistics for candidatevotes :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
     -1    4324   57328   66825  112144 1165136 

Summary Statistics for totalvotes :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
     -1  162266  206983  215165  263386 2656104 

Summary Statistics for unofficial :
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.000000 0.000000 0.000000 0.001202 0.000000 1.000000 

Summary Statistics for version :
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
20230706 20230706 20230706 20230706 20230706 20230706 

Summary Statistics for fusion_ticket :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.00000 0.00000 0.00000 0.08135 0.00000 1.00000 

Summary Statistics for State_Population :
   Length     Class      Mode 
    32452 character character 
No columns have null values.
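
The null check above only detects true NA values. The summaries show a minimum of -1 for candidatevotes and totalvotes, and the structure output later in this report shows "#N/A" strings in State_Population, so placeholder values may still be present. The following is a minimal base R sketch of how such placeholders could be counted. The toy data frame, the helper name count_sentinels, and the decision to treat -1 and "#N/A" as missing are all assumptions made for illustration, not part of the original analysis.

```r
# Toy stand-in for the election table; the values are illustrative only.
house <- data.frame(
  candidatevotes   = c(58906, -1, 7),
  totalvotes       = c(157170, -1, 157170),
  State_Population = c("#N/A", "#N/A", "5000000"),
  stringsAsFactors = FALSE
)

# Count true NA values per column (what a plain null check sees).
na_counts <- colSums(is.na(house))

# Count placeholder values that a null check misses.
# Treating negative counts and "#N/A" strings as missing is an assumption.
count_sentinels <- function(x) {
  if (is.numeric(x)) {
    as.numeric(sum(x < 0, na.rm = TRUE))
  } else {
    as.numeric(sum(x %in% c("#N/A", ""), na.rm = TRUE))
  }
}
sentinel_counts <- vapply(house, count_sentinels, numeric(1))

print(na_counts)
print(sentinel_counts)
```

If placeholders like these are confirmed in the real data, they could be converted to NA before any vote totals are summed or compared.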

Question 2 : How often did change occur in House representation from the years 2012-2022 in the state of Arizona and which voting methods played a significant role in these elections?

Approach

The analysis will look at election results from 2012, 2016, and 2022 to see how Arizona’s congressional district alignments have changed over time. The goal is to identify patterns of political representation inside the state’s districts, detecting movements in party control over both short-term and decade-long periods. Furthermore, the analysis will look into the influence of various voting procedures, specifically their impact on election outcomes in Arizona.

Analysis

2012

The resulting visualization is a color-coded map of Arizona’s congressional districts in 2012, based on the political party that won each seat. Each district is assigned a color that represents the winning party: blue for Democrats and red for Republicans. The map offers a visual picture of Arizona’s political distribution in 2012, highlighting regions of Democratic and Republican strength and providing insights into regional political dynamics. It reveals that most of Arizona’s congressional districts leaned towards the Democratic party during that election cycle.

2016

The resulting visualization is a color-coded map of Arizona’s congressional districts in 2016, based on the political party that won each seat. Each district is assigned a color that represents the winning party: blue for Democrats and red for Republicans. One key feature of this visualization is the use of transparency to highlight shifts in political representation. Districts with a party change are displayed with a lower opacity, distinguishing them from those with no change, which keep full color saturation. This visualization reveals that the 6th congressional district changed from the Democratic party to the Republican party in the 2016 election cycle.
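
The change-of-party flag behind the transparency encoding can be sketched as follows. This is a minimal base R illustration rather than the report's actual code. The toy winners table, the column names changed and alpha, and the opacity values 0.4 and 1.0 are assumptions, and the comparison assumes a district number refers to the same seat in both cycles.

```r
# Hypothetical district winners; the parties are made up for illustration.
winners <- data.frame(
  year     = c(2012, 2012, 2012, 2016, 2016, 2016),
  district = c(1, 2, 3, 1, 2, 3),
  party    = c("DEMOCRAT", "REPUBLICAN", "DEMOCRAT",
               "DEMOCRAT", "DEMOCRAT", "DEMOCRAT"),
  stringsAsFactors = FALSE
)

# Split into the earlier and later cycle, renaming the earlier party column.
prev <- winners[winners$year == 2012, c("district", "party")]
names(prev) <- c("district", "party_prev")
curr <- winners[winners$year == 2016, c("district", "party")]

# Join the two cycles on district so each row holds both winners.
flips <- merge(curr, prev, by = "district")

# A district "changed" when the winning party differs between cycles.
flips$changed <- flips$party != flips$party_prev

# Assumed opacity values: faded for changed districts, solid otherwise.
flips$alpha <- ifelse(flips$changed, 0.4, 1.0)

print(flips)
```

A column like alpha could then be mapped to the opacity of each district polygon when the map is drawn.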

2022

The resulting visualization is a color-coded map of Arizona’s congressional districts in 2022, based on the political party that won each seat. Each district is assigned a color that represents the winning party: blue for Democrats and red for Republicans. One key feature of this visualization is the use of transparency to highlight shifts in political representation. Districts with a party change are displayed with a lower opacity, distinguishing them from those with no change, which keep full color saturation. This visualization reveals that the 1st and 9th districts changed from the Democratic party to the Republican party, while the 4th congressional district changed from the Republican party to the Democratic party.

Discussion

An interesting trend emerges from examining Arizona’s congressional district maps: the political landscape rarely changes over brief periods, such as between two elections held just a few years apart (2012 and 2016). This consistency points to strong party loyalty during these periods. A decade of observation tells a different story, with notable changes in party representation across districts. This finding highlights a slow but significant shift in voter attitudes and party affiliations. Although it is rare for districts to switch parties within one or two election cycles, a ten-year period shows a dynamic and ever-changing political landscape. This long-term perspective allows for a better understanding of the complexity and gradual shifts in voter preferences and party power.


EDA for voting methods

# A tibble: 32,452 × 21
    year state   state_po state_fips state_cen state_ic office   district stage
   <dbl> <chr>   <chr>         <dbl>     <dbl>    <dbl> <chr>       <dbl> <chr>
 1  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 2  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 3  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 4  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 5  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 6  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 7  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
 8  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
 9  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
10  1976 ALABAMA AL                1        63       41 US HOUSE        4 GEN  
# ℹ 32,442 more rows
# ℹ 12 more variables: runoff <dbl>, special <dbl>, candidate <chr>,
#   party <chr>, writein <dbl>, mode <chr>, candidatevotes <dbl>,
#   totalvotes <dbl>, unofficial <dbl>, version <dbl>, fusion_ticket <dbl>,
#   State_Population <chr>

Feature Extraction and Data Wrangling for voting methods

  1. Feature Extraction : Two new features were created, Type_of_Voting and RESULT (Win or Loss). Both were derived by applying a custom function to the existing categorical and indicator columns.

    Extracting 1st Feature : Type_of_Voting

    # A tibble: 32,452 × 22
        year state   state_po state_fips state_cen state_ic office   district stage
       <dbl> <chr>   <chr>         <dbl>     <dbl>    <dbl> <chr>       <dbl> <chr>
     1  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
     2  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
     3  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
     4  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
     5  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
     6  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
     7  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
     8  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
     9  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
    10  1976 ALABAMA AL                1        63       41 US HOUSE        4 GEN  
    # ℹ 32,442 more rows
    # ℹ 13 more variables: runoff <dbl>, special <dbl>, candidate <chr>,
    #   party <chr>, writein <dbl>, mode <chr>, candidatevotes <dbl>,
    #   totalvotes <dbl>, unofficial <dbl>, version <dbl>, fusion_ticket <dbl>,
    #   State_Population <chr>, Type_of_Voting <chr>

    A “Type_of_Voting” column has been added to the dataset, categorizing entries as normal elections, fusion tickets, runoffs, special elections, write-ins, or unofficial results. This feature supports the analysis by making the variety of election processes in the data explicit.
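
One way such a column could be derived from the 0/1 indicator columns is sketched below in base R. The helper name label_voting_type is hypothetical. Only the labels "Normal" and "writein" appear in the output of this report, so the other labels and the priority order used when several flags are set are assumptions.

```r
# Toy rows using the 0/1 indicator columns present in the dataset.
votes <- data.frame(
  writein       = c(0, 1, 0, 0, 0, 0),
  fusion_ticket = c(0, 0, 1, 0, 0, 0),
  runoff        = c(0, 0, 0, 1, 0, 0),
  special       = c(0, 0, 0, 0, 1, 0),
  unofficial    = c(0, 0, 0, 0, 0, 1)
)

# Label each row from its indicator flags.
# Later assignments overwrite earlier ones, so the last flag checked
# takes priority when several are set (an assumed ordering).
label_voting_type <- function(df) {
  out <- rep("Normal", nrow(df))
  out[df$unofficial == 1]    <- "unofficial"
  out[df$special == 1]       <- "special"
  out[df$runoff == 1]        <- "runoff"
  out[df$fusion_ticket == 1] <- "fusion"
  out[df$writein == 1]       <- "writein"
  out
}

votes$Type_of_Voting <- label_voting_type(votes)
print(votes$Type_of_Voting)
```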

Extracting 2nd Feature : RESULT (Win/Loss)

spc_tbl_ [32,452 × 22] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ year            : num [1:32452] 1976 1976 1976 1976 1976 ...
 $ state           : chr [1:32452] "ALABAMA" "ALABAMA" "ALABAMA" "ALABAMA" ...
 $ state_po        : chr [1:32452] "AL" "AL" "AL" "AL" ...
 $ state_fips      : num [1:32452] 1 1 1 1 1 1 1 1 1 1 ...
 $ state_cen       : num [1:32452] 63 63 63 63 63 63 63 63 63 63 ...
 $ state_ic        : num [1:32452] 41 41 41 41 41 41 41 41 41 41 ...
 $ office          : chr [1:32452] "US HOUSE" "US HOUSE" "US HOUSE" "US HOUSE" ...
 $ district        : num [1:32452] 1 1 1 2 2 2 3 3 3 4 ...
 $ stage           : chr [1:32452] "GEN" "GEN" "GEN" "GEN" ...
 $ runoff          : num [1:32452] 0 0 0 0 0 0 0 0 0 0 ...
 $ special         : num [1:32452] 0 0 0 0 0 0 0 0 0 0 ...
 $ candidate       : chr [1:32452] "BILL DAVENPORT" "JACK EDWARDS" "WRITEIN" "J CAROLE KEAHEY" ...
 $ party           : chr [1:32452] "DEMOCRAT" "REPUBLICAN" "WRITE-IN (INDEPENDENT)" "DEMOCRAT" ...
 $ writein         : num [1:32452] 0 0 1 0 0 1 0 0 1 0 ...
 $ mode            : chr [1:32452] "TOTAL" "TOTAL" "TOTAL" "TOTAL" ...
 $ candidatevotes  : num [1:32452] 58906 98257 7 66288 90069 ...
 $ totalvotes      : num [1:32452] 157170 157170 157170 156362 156362 ...
 $ unofficial      : num [1:32452] 0 0 0 0 0 0 0 0 0 0 ...
 $ version         : num [1:32452] 20230706 20230706 20230706 20230706 20230706 ...
 $ fusion_ticket   : num [1:32452] 0 0 0 0 0 0 0 0 0 0 ...
 $ State_Population: chr [1:32452] "#N/A" "#N/A" "#N/A" "#N/A" ...
 $ Type_of_Voting  : chr [1:32452] "Normal" "Normal" "writein" "Normal" ...
 - attr(*, "spec")=
  .. cols(
  ..   year = col_double(),
  ..   state = col_character(),
  ..   state_po = col_character(),
  ..   state_fips = col_double(),
  ..   state_cen = col_double(),
  ..   state_ic = col_double(),
  ..   office = col_character(),
  ..   district = col_double(),
  ..   stage = col_character(),
  ..   runoff = col_double(),
  ..   special = col_double(),
  ..   candidate = col_character(),
  ..   party = col_character(),
  ..   writein = col_double(),
  ..   mode = col_character(),
  ..   candidatevotes = col_double(),
  ..   totalvotes = col_double(),
  ..   unofficial = col_double(),
  ..   version = col_double(),
  ..   fusion_ticket = col_double(),
  ..   State_Population = col_character()
  .. )
 - attr(*, "problems")=<externalptr> 
# A tibble: 32,452 × 23
    year state   state_po state_fips state_cen state_ic office   district stage
   <dbl> <chr>   <chr>         <dbl>     <dbl>    <dbl> <chr>       <dbl> <chr>
 1  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 2  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 3  1976 ALABAMA AL                1        63       41 US HOUSE        1 GEN  
 4  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 5  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 6  1976 ALABAMA AL                1        63       41 US HOUSE        2 GEN  
 7  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
 8  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
 9  1976 ALABAMA AL                1        63       41 US HOUSE        3 GEN  
10  1976 ALABAMA AL                1        63       41 US HOUSE        4 GEN  
# ℹ 32,442 more rows
# ℹ 14 more variables: runoff <dbl>, special <dbl>, candidate <chr>,
#   party <chr>, writein <dbl>, mode <chr>, candidatevotes <dbl>,
#   totalvotes <dbl>, unofficial <dbl>, version <dbl>, fusion_ticket <dbl>,
#   State_Population <chr>, Type_of_Voting <chr>, RESULT <chr>

A new column titled RESULT is added to the dataset, labelling a candidate as “Win” if they received the most votes in their race and “Loss” otherwise. This separates winning and losing candidates by vote count and adds a useful categorization to the dataset.
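
A minimal base R sketch of this labelling follows. It assumes a race is identified by year, state, and district, and that the candidate row with the highest candidatevotes wins. The report does not state its exact criteria, so ties, fusion tickets split across rows, and votes reported per mode would need extra handling. The vote counts are taken from the first rows of the dataset shown above, and the candidate labels A to E are placeholders.

```r
# First five rows of the dataset, with placeholder candidate labels.
races <- data.frame(
  year           = rep(1976, 5),
  state_po       = rep("AL", 5),
  district       = c(1, 1, 1, 2, 2),
  candidate      = c("A", "B", "C", "D", "E"),
  candidatevotes = c(58906, 98257, 7, 66288, 90069),
  stringsAsFactors = FALSE
)

# Largest vote count within each race (year x state x district).
top_votes <- ave(races$candidatevotes,
                 races$year, races$state_po, races$district,
                 FUN = max)

# A row is a "Win" when it holds the top vote count of its race.
races$RESULT <- ifelse(races$candidatevotes == top_votes, "Win", "Loss")

print(races[, c("district", "candidate", "candidatevotes", "RESULT")])
```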

  2. Data Wrangling :

Arizona’s vote data from 2012 to 2022 was filtered and analyzed, with emphasis on the various methods of voting and how they affect candidates’ outcomes. Focusing on three fields (year, voting type, and election result), the analysis computes the frequency and percentage of each result. This makes it possible to visualize how Arizona’s district elections evolved over that decade.
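
The filtering and tabulation described here might look like the base R sketch below. The toy rows are invented, and computing each percentage within a year and voting type is an assumption about how the percentages were defined.

```r
# Invented rows carrying the two derived features plus year and state.
house <- data.frame(
  year           = c(2012, 2012, 2012, 2012, 2020, 2020, 1976),
  state_po       = c("AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AL"),
  Type_of_Voting = c("Normal", "Normal", "writein", "Normal",
                     "Normal", "writein", "Normal"),
  RESULT         = c("Win", "Loss", "Loss", "Win", "Loss", "Loss", "Win"),
  stringsAsFactors = FALSE
)

# Keep Arizona rows from 2012 through 2022.
az <- house[house$state_po == "AZ" &
              house$year >= 2012 & house$year <= 2022, ]

# Frequency of each (year, voting type, result) combination.
az$n <- 1
counts <- aggregate(n ~ year + Type_of_Voting + RESULT, data = az, FUN = sum)

# Percentage of each result within its year and voting type (assumed basis).
group_total <- ave(counts$n, counts$year, counts$Type_of_Voting, FUN = sum)
counts$pct <- 100 * counts$n / group_total

print(counts)
```

A table in this shape, with one row per year, voting type, and result, can feed a grouped bar chart directly.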

Data Visualization

Discussion

The visualization describes election outcomes by voting type from 2012 to 2022, distinguishing between victories and defeats through color-coded bars and labeled percentages. The analysis of Arizona’s electoral data over the past decade reveals varying success rates for conventional voting methods, marked by a significant decline in 2020 followed by a notable recovery in 2022. In contrast, write-in candidates consistently lost, year after year. This gap implies that traditional voting procedures, despite their continued dominance, do not guarantee success, and it underlines the ongoing difficulty that write-in candidates have in gaining political success in Arizona. These visualizations show how electoral outcomes shifted across voting methods over time, providing insight into the evolving political dynamics and voter preferences within the state.